PCA-HPR: A principle component analysis model for human promoter recognition
Abstract
We describe a promoter recognition method named PCA-HPR that locates eukaryotic promoter regions and predicts transcription start sites (TSSs). We compute codon (3-mer) and pentamer (5-mer) frequencies and assemble them into codon and pentamer frequency feature matrices, from which informative and discriminative features are extracted for classification. Principal component analysis (PCA) is applied to the feature matrices, and a subset of the principal components (PCs) is selected for classification. The system uses three neural network classifiers to distinguish promoters from exons, promoters from introns, and promoters from 3' untranslated regions (3'UTRs). We compared PCA-HPR with three well-known promoter prediction systems: DragonGSF, Eponine, and FirstEF. Validation shows that PCA-HPR achieves the best performance of the four prediction systems on all three test sets.
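The feature-extraction steps named in the abstract (k-mer frequency matrices followed by PCA) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy sequences, the use of overlapping windows, and the choice of two retained components are all assumptions, and the neural network classifiers are not shown.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGT"


def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer over ACGT, using overlapping windows (assumed)."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows with N or other symbols are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def feature_matrix(seqs, k):
    """Stack one k-mer frequency vector per sequence into a matrix."""
    return np.vstack([kmer_frequencies(s, k) for s in seqs])


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (SVD of centered data)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Hypothetical toy sequences, for illustration only.
seqs = [
    "ACGTACGTGGCCTATAAAGG",
    "GGGCGGGGCGCCGCGTATAA",
    "ATGGCCAAGCTGACCTTTGA",
    "TTTTAAAACCCCGGGGACGT",
]
codon_X = feature_matrix(seqs, 3)     # 4 sequences x 64 codon frequencies
pentamer_X = feature_matrix(seqs, 5)  # 4 sequences x 1024 pentamer frequencies
codon_pcs = pca_project(codon_X, 2)   # 4 sequences x 2 principal components
```

In a setup like the one described, the projected components (here `codon_pcs`) would be the inputs to each of the three classifiers; how many components to keep is a tuning choice that the abstract does not specify.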
Similar resources
AnG-HPR: Analysis of n-Gram based human Promoter Recognition
We describe a promoter recognition method named An-HPR to locate eukaryotic promoter regions and predict transcription start sites (TSSs). N-gram features are extracted and used in promoter prediction. We computed n-grams (n=2, 3, 4, 5) as features and created frequency features to extract informative and discriminative features for effective classification. Neural network classifie...
Improving the quality of images synthesized by discrete cosines transform – regression based method using principle component analysis
Purpose: Different views of an individuals’ image may be required for proper face recognition. Recently, discrete cosines transform (DCT) based method has been used to synthesize virtual views of an image using only one frontal image. In this work the performance of two different algorithms was examined to produce virtual views of one frontal image. Materials and Methods: Two new meth...
Patterns Prediction of Chemotherapy Sensitivity in Cancer Cell lines Using FTIR Spectrum, Neural Network and Principal Components Analysis
Drug resistance enables cancer cells to break away from cytotoxic effect of anticancer drugs. Identification of resistant phenotype is very important because it can lead to effective treatment plan. There is an interest in developing classifying models of resistance phenotype based on the multivariate data. We have investigated a vibrational spectroscopic approach in order to characterize a...
Cisplatin Resistant Patterns in Ovarian Cell Line Using FTIR and Principle Component Analysis
Cisplatin is a common chemotherapeutic agent that is used for the treatment of many solid cancers. Rapid identification of chemotherapy resistance is very important and may lead to an effective treatment plan. Spectroscopy techniques, such as infrared spectroscopy, which are sensitive to the biochemical composition of samples, have shown potential to discriminate tissues. Developing in Fourier transform inf...
Journal: Bioinformation
Volume: 2
Publication date: 2008